Skip to content

Ashwin/patch spark3 jars#17

Open
ash7bhide wants to merge 5 commits intoaffirm-3.1.2from
ashwin/patch-spark3-jars
Open

Ashwin/patch spark3 jars#17
ash7bhide wants to merge 5 commits intoaffirm-3.1.2from
ashwin/patch-spark3-jars

Conversation

@ash7bhide
Copy link
Copy Markdown

What changes were proposed in this pull request?

  • Propagate spark jars to executors (bug in current spark version for k8s)
  • Change all "pyspark" references to "pyspark3"
  • Fix k8s auth issue which resulted in 401 between driver and spark-submit client

Copy link
Copy Markdown

@metra metra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving with some comments. I don't have much context to review this.

// via the k8s application, like in cluster mode driver
childClasspath ++= resolvedMavenCoordinates.split(",")
} else {
// In K8s client mode, when in the driver, add resolved jars early as we might need
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unindent

for (jar <- resolvedMavenCoordinates.split(",")) {
addJarToClasspath(jar, loader)
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why you added if (isKubernetesClusterModeDriver) to the else branch but I presume it's for a good reason.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because when spark-submit is in the k8s driver - deployMode == client

val isKubernetesCluster = clusterManager == KUBERNETES && deployMode == CLUSTER
val isKubernetesClient = clusterManager == KUBERNETES && deployMode == CLIENT
 val isKubernetesClusterModeDriver = isKubernetesClient &&
      sparkConf.getBoolean("spark.kubernetes.submitInDriver", false)

# limitations under the License.

__version__ = "3.1.2+affirm4"
__version__ = "3.1.2+affirm8"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: what happened to 5 through 7?

<artifactId>jackson-dataformat-yaml</artifactId>
</exclusion>
</exclusions>
</dependency>
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ported from the mainline?

return serviceHost != null && serviceHost.length > 0
}

def getHomeDir(): String = {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where's this coming from?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what sets it, but it's always present on k8s containers. We have something similar in ml_pipelines

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant more are you writing this code from scratch or are you porting it from somewhere?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it's coming from @daggertheog's diff for the spark 2 k8s auth issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants